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Abstract. The aimed high sensitivities and large fields of view of the new generation of interferometers impose 
to reach high dynamic range of order '-lilO'' to 1:10* in the case of the Square Kilometer Array. The main 
problem is the calibration and correction of the Direction Dependent Effects (DDE) that can affect the electro- 
magnetic field (antenna beams, ionosphere, Faraday rotation, etc.). As shown earlier the A-Projection is a fast 
and accurate algorithm that can potentially correct for any given DDE in the imaging step. With its very wide 
field of view, low operating frequency (~ 30 — 250 MHz) , long baselines, and complex station-dependent beam 
patterns, the Low Frequency Array (LOFAR) is certainly the most complex SKA precursor. In this paper we 
present a few implementations of A-Projection applied to LOFAR that can deal with non-unitary station beams 
and non-diagonal Mueller matrices. The algorithm is designed to correct for all the DDE, including individual 
antenna, projection of the dipoles on the sky, beam forming and ionospheric effects. We describe a few important 
algorithmic optimizations related to LOFAR's architecture allowing us to build a fast imager. Based on simulated 
datasets we show that A-Projection can give dramatic dynamic range improvement for both phased array beams 
and ionospheric effects. We will use this algorithm for the construction of the deepest extragalactic surveys, 
comprising hundreds of days of integration. 



1. Introduction: LOFAR and the direction 
dependent effects 

With the building or development of many large radio tele- 
scopes (LOFAR, EVLA, ASKAP, MeerKAT, MWA, SKA, 
e-Merlin), radio astronomy is undergoing a period of rapid 
development. New issues arise with the development of 
these new types of interferometer, and the approximations 
applicable to the older generation of instruments are not 
valid anymore. Specifically, they have wide fields of view 
and will be seriously affected by the Direction Dependent 
Effects (DDE). Dealing with the DDE in the framework 
of calibration and imaging represents an unavoidable chal- 
lenge, on both the theoretical, numerical and technical as- 
pects of the problem (see .Bhatnagar.,2009. for a detailed 
review of the problems associated with calibration and 
wide field imaging in the presence of DDE). 

This is particularly true for the Low Frequency Array 
(LOFAR). It is an instrument that observes in a mostly 
unexplored frequency range {i^ < 240 MHz), and will be 
one of the largest radio telescopes ever built in terms of 
collecting area. LOFAR's design is built on a combina- 



tion o f phased array and interferometer (see Ide Vos et al 
(|2009l) :'or a description of the LOFAR system). It is made 
of 40 stations in the Netherlands, and 8 international sta- 
tions (5 in Germany, 1 in France, England, and Sweden). 
The High Band Antenna stations (110-240 MHz, HBA 
hereafter) are made of 24 to 96 tiles of 4 x 4 antenna coher- 
ently summed, while the Low Band Antenna (10-80 MHz, 
LBA) are clustered in groups of 96 elements. At the station 
level, the signals from the individual antennas or tiles (in 
the cases of LBA and HBA respectively) are phased and 
summed by the beamformer. This step amounts to forming 
a virtual antenna pointing at the targeted field location. 
The data is transported from the various stations of the 
LOFAR array to the correlator. The whole process and the 
pi peline architecture have been described in more details 
(|2010ll 



Heald et al 



LOFAR is affected by many com- 
plex baseline-time- frequencjBl dependent DDE, consisting 
mainly of the antenna/station beams and the ionosphere, 
which varies on angular scales of degrees and time scales 
of ^ 10 minutes and ^ 30 seconds respectively. We cur- 
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rently have models of both the high-band and low-band 
station beams (H BA and LB A resp e ctively ). 

As shown in iBhatnaear et al. (j2008l) A-Projection 
allows to estimate sky images, taking into account 
all the possib le complicated effects associated to the 



DDE (see als o 'Bern ardi et alj [ioilUMitchell et al.ll2012 : 
Sullivan et al. 2012, in the context of the Murchison 
Widefield Array and forward modeling) . However contrar- 
ily to dishes-based interferometers, where the beam shape 
and polarization angle are affected by pointing errors and 
rotated on the sky by the parallactic angle (depending on 
the dish mount), LOFAR is based on phased arrays that 
have very wide fields of view (up to ~ 12 degrees), non- 
trivial and quickly varying beams, thereby driving compli- 
cated polarization effects. Technically speaking, the very 
wide fields of view instruments that aim to reach high dy- 
namic range, have to deal with baseline-dependent non- 
diagonal Mueller matrices (see Sec. [5] for a detailed dis- 
cussion). For the VLA implementation, due to the ap- 
proximate Unitarity of VLA beams, it was sufficient for 
A-Projection to take into account the diagonal terms of 
the Mueller matrices to demonstrate corrections for in- 
strumental polarization. That is not possible for LOFAR 
that has heavily non-diagonal baseline-associated Mueller 
matrices, and all 4 x 4 Mueller terms have to be taken into 
account. 

We show in this pa per that the scheme described in 
Bhatnagar et al.l (|2008l ) can indeed deal with the heavily 
non-unitary beams associated with the very wide field of 
view of phased arrays. Our imaging algorithm could take 
as input any model or calibration solution or ionosphere 
phase screen. In Sec.[5]we describe the issues related with 
the usage of phased arrays in interferometers, and focus 
on the LOFAR- related issues i.e. the polarization aspects 
and baseline dependence of the DDE. We describe a few 
important algorithmic optimizations related to LOFAR's 
architecture allowing us to build a fast image 10. In Sec. El 
we describe the A-Proj ection algorithm first presented in 
Bhatnagar et al.l ( 20081 ). and detail the various implemen- 



tations and optimizations we have found to make it rea- 
sonably fast in the case of LOFAR. We present the results 
in Sec. S] and show that beam and ionosphere corrections 
can both be performed at high accuracy. We summarize 
and discuss the next developments in Sec. [5l 

2. Polarization effects associated with very wide 
fields of view interferometers 

In this section we describe the polarizations effects asso- 
ciated with the complex structure of the DDE inherent 
to the usage of phased arrays that have very wide fields 
of view and non-diagonal baseline-dependent Mueller ma- 
trices (or non-imitary Jones matrices in the case of in- 
terferometers having similar antennas, see Fig. [T] and the 
discussion in Sec. 12.11) . 



^ Our software {awimager) is built on the Casa imager im- 
plementation. 



With its long baselines, large fractional bandwidth, 
very wide field of view and station-dependent effective 
Jones matrices (beam, ionosphere, Faraday rotation), the 
Mueller matrices to be considered are not only non- 
diagonal, but are also baseline-dependent. In order to 
highlight some of the main complications associated with 
very wide fields of view instruments, in Sec. 12.11 we de- 
scribe in detail the structure of t he linear operator intro- 
duced bv lBhatnagar et al] ( 2008 ). We propose in Sec. 12.21 
a method to approximately correct for t he associated ef- 
fects corrupting the image plane (see also lRau fc Cornwell 



201 iL for other examples of the use of linear operator in 



the context of image synthesis and deconvolution). 

2.1. Description of the baseline-dependence, DDE and 
polarization effects 

For convenience, in this section and throughout the paper, 
we do not show the sky term \/l ~ P — m? that usually 
divide the sky to account for the projection of the celestial 
sphere on the plane, as this has no infiuence on the results. 
The DDE below are baseline-dependent, as it is the case 
for LOFAR, and the Measurement Equation formalism 
can properly model those effects (for extensive discussions 
on validity and limitations of the Meas urement Equation 



Hamaker et al. 1996t Smirnovll201ll and see Appendix 



mea^ iS the Set 



see 

El for a short review of our needs) . If 
of 4-polarization measurement (XX, XY, YX, YY), Gp 
is the direction independent Jones matrix of antenna-p, 
then the corrected visibilities on baseline pq are Vp°^^ — 
[Gtu,p]-^ y^i'"' .[GgX^ , and we can write: 



Vec(F--) = SiD^s 



.Vec(/« 



. exp {—2m(j)[u, v, w, s))ds 



(1) 



where / is the 4-polarization sky, ig) is the Kronecker 
product, Vec is the operatoJUl that transforms a 2x2 
matrix into a dimension 4 vector, and (j){u,v,w, s) = 
u.l + v.m+w.{-\/l — P— m?^\) models the product of the 
effects of correlator, sky brightness and array geometry. 
The matrix (g) is a 4 x 4 matrix, and throughout 
the text we refer to it as the Mueller matri^o- We can also 
write Eq. [T]in terms a series of linear transformations: 



* pq PQ PQ PQ 



(2) 



.^.^ --'P5 visibility measurement points 



where V^^j"" are the AN*^" 
in the time-frequency block in which the direction depen- 
dent effects are assumed to be constant. If Npi^ is the 
number of pixels in the sky model, the true sky image 
vector / has a size of 'iNpix and contains the full polariza- 
tion information {XXx,XYx,YXx,XYx) on the x^^ pixel 
at the 4a: position. I?*'^ contains the direction dependent 
effects, and is a {ANpix) x [ANpi^) block diagonal matrix. 



^ This is not completely true as traditionally the Mueller 
matrix multiplies an (I, Q, U, V) vector and not an (XX, XY, 
YX, YY) correlation vector. 
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Fig. 1. LOFAR stations are phased arrays characterized by a large field of view. The X/Y polarimetric measurements 
are therefore non trivial standard compared to a radio telescope using dishes: we have to take into account the 
projection of the dipoles that are generally non-orthogonal on the sky and the fact that this angle varies across the 
field of view. This figure shows the Mueller matrix corresponding to baseline (01) in a given time and frequency slot, 
normalized by the Mueller matrix at the phase center (see text). Each pixel in the plot shows the ampHtude of 
the Mueller matrix element in a certain direction s using a logarithmic scale. Even in this normalized version, 
the off-diagonal Mueller terms are as high as 10% and cannot be neglected. 



On a given baseline (p,q), each of its 4 x 4 block is the 
T^pqi^x) — D*{sx) (8i Dp{sx) Mueller matrix evaluated at 
the location of the x*'^ pixel. F is the Fourier transform 
operator of {iNpi^) x (4iVpi2;). Each of its (4 x 4) block is a 
scalar matrix, the scalar being the kernel of the Fourier ba- 
sis exp {—2iTr<j){u, v, w, s)). The matrix Sp'^ is the uv-plane 
sampling function for that visibility of size iNp'^ x ANpix, 
and Wpg^ is the diagonal ANp^ x ANp^ matrix containing 
the weights associated with the AN. 



p'^ visibilities. 



We show in Fig. [T] that the Mueller matrix is non- 
diagonal for LOFAR. The various panels show the am- 
plitude of the direction dependent Mueller matrix us- 
ing our beam model only (therefore it does not include 
Faraday rotation and ionosphere), for the baseline (p, q) = 
(0, 1) in a given time and frequency slot. In order to 
minimize the off-diagonal elements, we have computed 
I?o(s) = ['D{sq)]~^ .Vis) in the direction s, where Sq is the 
phase center direction. Intuitively, the normalization of 
the Mueller matrix by ^{so) makes the projected dipoles 
on the sky orthogonal at the phase center (off-diagonal 
elements are zero there) , while this gets less true as we go 
further from the targeted location. The off-diagonal ele- 



ments can be as high as ^ 10% a few degrees from the 
pointed location and contrarily to most interferometers 
they cannot be neglected in the case of LOFAR. 

We are generally interested in using the total set of vis- 
ibilities over baselines, time and frequencies, having 4A'^v' 
points. We can write: 



V = A.I 



(3) 



where e is noise, and A is a (ANv) x {4:Npix) matrix, made 
of the W^'g'' .Sl'^ .F.Vl;^ on top of each other (each have di- 
mension 4:Ni,iock X "iNpix^ where Nuock is the total number 
of time- frequency blocks). We can also write A = WSJ-V, 
where V has size '^NuocksNpix x Npix, and has all the 
I?*'^ on top of each other. F is the block-diagonal Fourier 
transform operator with size 4:NuocksNpix x ^NuocksNpi^, 
with all of its blocks ANpix x ANpix being equal to the 
matrix F appearing in Eq. [21 S and W are the sampling 
and weights matrix of sizes {ANy) x {'^NuocksNpix) and 
[ANv) X {ANv) respectively. The transformation of Eq.[3]is 
exact. Estimating a sky from a sparsely sampled measured 
set of visibilities is less trivial. 
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Fig. 2. This figure shows the correction {T>q .Vo) ^ that can be apphed to the image before the minor cycle. This is 
a first order correction for the complicated phased array beam, that depends on time, frequency and baseline. 



2.2. Estimating a sky image: polarization effects 

There are different ways to solve for / from the set of 
visibilities, and reversing Eq. |3] relies on the linearity of 
the sky term in the measurement equation. As mentioned 
above our deconvolution scheme uses A-Projection. This 
generalization of CS-CLEAN is now better understood in 
the framework of c ompressed sensing, which include other 
new techniques (see McEwen fc Wiati?3l2011 . for a review). 
Compressed sensing provides new ways to estimate the sky 
/ for the set of visibilities. In this section we describe a 
method to approximatly correct the polarisation effects in 
the image plane. In order to highlight the issues associated 
with polarization and baseline-dependence of the DDE, we 
simply write the sky least square solution / as the pseudo 



/ = {A".Ay\A''.V 



(4) 



is the AN, 
X (47V, 



and 



p„ dirty image, 
pix) image plane decon- 



where the term .V 
(A^.A)-^ is the {ANpr. 
volution matrix. Its structure is rather complicated and 
its size makes its estimate prohibitive. 

In the simple case of a matrix V being unity (no DDE), 
we can see that each 4x4 block number {x, y) of the matrix 
A^ .A is the instrumental response to a source centered at 
the location of the x^^ pixel, evaluated at the y*^ pixel. 
Therefore, computing {A^ .A)~^ would involve estimat- 



ing point spread function (PSF) centered at the location 
of each pixel and inverting a -iNpix x ANpix matrix. In the 
presence of non-trivial baseline-dependent DDE involving 
non-diagonal Mueller matrices, the problem becomes more 
complex. However, we show below that under some as- 
sumptions, the operator {A^ .A)'"^ (sometimes called the 
deconvolution matrix) affected by DDE is decomposable in 
a classical deconvolution matrix (containing information 
on the PSF), and a simple correction performed separately 
on each pixel. 

Following the notation introduced above, we can write 
A" .A = V"VV as a ANpi^ x 47Vp„ matrix, with P = 
F"S"W"WSF of size muocksNp,, x muocksNp^,. This 
later matrix is block diagonal, and each of its ANpix x ANpix 
block describes the PSF of the instrument for a given 
baseline-time-frequency block. Their 4x4 xy-block are 
scalar matrices, the scalar pt,v.pq{x,y) being the response 
of the instrument evaluated at the y^^ pixel to a source 
being centered at the x^^ pixel in the given baseline-time- 
frequency block. We then have: 



[A" .A]{x,y) = ^ Pt,,.,pq{x,y)V^^^pg{y)Vt,^,pq{x) 



(5) 



It is virtually impossible to compute this matrix, and 
this illustrates the difficulty of doing an image plane de- 
convolution in the presence of time-frequency-baseline de- 
pendent DDE. In order to apply a first order correction 
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to the image plane, we assume that the direction depen- 
dent effects are constant enough across time, frequency, 
and baseline. Then we can write: 



[A".A]{x, y) ^ Nt,,^pgp{x, y)VHV{x, y) 

where Nt^i,,pq is the number if basehne-time-frequency 
blocks, 'D^T>{x, y) is a 4 x 4 matrix being the average 
of Vp'^ (y)^ .Vp'^ {x) over baselines, time, and frequency 
and p(x, y) is the PSF stacked in baselines, time, and fre- 
quency. If the uv-coverage is good enough, then .A is 
block diagonal {p{x,y) = for x ^ y, p{x,y) = 1 other- 
wise). All the X ^ y terms cancel out in the final product, 
and in the relevant part of V^T) matrix are the Npi^ Ax A 

blocks on the diagonal. Applying V^V to A^ .V can 
then be done on each pixel separately, by computing the 
product 'D^'D{x, x)~^ .Ix where contains the full po- 
larization information {XX^, XY^, YX^, XY^) for the x*^ 
pixel. This provides a way to estimate an approximate 
least square clean component value from a flux density in 
the dirty image in a given direction ■ The details of this 
image plane normalization are further discussed in Sec. 
13.11 Although this normalization underlies a few assump- 
tions, from the simulations presented in Sec. 14.31 we argue 
that the complex 4x4 matrix normalization per pixel pre- 
sented here brings clear improvement. However it does not 
seem necessary for the sky solutions to convergence (see 
SecO. 



3. Implementation of A-Projection for LOFAR 



As explained above. 



the full polarization A-Pr o iectiq n 

]2008h . 



scheme has been described in Bhatnaear et al. 



However for the VLA implementation, due to the ap- 
proximate unitarity of VLA beams, only the diagonal 
Mueller matrix terms had to be taken into account. As 
explained in Sec. [U LOFAR has got very wide fields of 
view, and the baseline-dependent Mueller matrices asso- 
ciated to the 4-polarizations correlation products are non- 
diagonal. This basically means that each individual polar- 
ization cannot be treated independently from the others. 
In this section we describe in detail a few implementations 
of A-Projection allowing to correct for the non-diagonal 
Mueller matrices. We propose optimizations in relation to 
the telescope architecture. We will show in Sec.|4]that this 
algorithm can indeed deal with the heavily non-diagonal 
Mueller matrices associated with the LOFAR's wide fields 

of view. 

Following HhatnagareTaD (|2008l) . we have build our 
implementation of A-Projection on t op of a tradi tion 
Cotton & Schwab CLEAN algorithm (jSchwabl [l983) . It 



is an iterative deconvolution algorithm that consists of 
two intricated steps. The CLEAN and A-Projection al- 
gorithms performs a series of operations that can be de- 
scribed in the following way: 



where the true sky estimate at step n -1-1 is built from 
its estimate /" at step n and $ is a non-linear operator. 
It basically estimates the deconvolved sky from the resid- 
ual dirty image A^ {V — AI^). The minor cycle performs 
operations that approximate {A^ A)~^ discussed in Sec. 
[51 and takes the estimated image vector to zero but at 
the strongest components above a certain threshold. For 
the sky solutions to converge, the predict step [AI"' or de- 
gridding) has to be accurate and unbiased. A-Projection 
is a fast way to apply A or A^ . If we separate the phase 
of the Fourier kernel in Eq. [l]as the sum of w, s) — 
u.l +v.m and (jy^jw, s ) = w. {\/\ — P — m? — 1), then folow- 



ing Bhatnaear et al. ( 2008h invoking the convolution the- 
orem Eq. [1] becomes: 



Vpq{u,V,W,i) 



:F \ Y,Vl;:;{^,J,s)W{w,s)I,{l,m) 



E 



F{V'p:;{i,3,s)W{w,s)) 
* {l,m)) 



(7) 



(6) 



where * is the convolution product, is a 2D Fourier 
transform, W{'w^s) = exp (— 2z7ru;.(-\/l — P — m? — 1)), 
and i and j index the polarization number (running over 
(XX, XY, YX, YY)). This method is efficient because the 
DDE are smooth on the sky, meaning the support of the 
corresponding convolution function can be small (no high 
frequency terms). Indeed, as shown in Fig. [1] the LOFAR 
station beam is very smooth, and depending on the field 
of view the typical support is on the order of 5-11 pixels. 

3.1. Naive implementation 

The operation Al in Eq. [S] converts a sky model into 
visibilities. To apply this operator on a massive amount 
of data in an algorithmically efficient way, we apply the 
scheme outlined in Eq. [T] First (i) a 2-dimensional fast 
Fourier transform is applied to the sky model image /, 
on the 4 polarizations independently. Then (ii) for each 
baseline in a time-frequency block A(<, v) where the DDE 
is assumed to be constant, the 16 convolution functions 
are computed as the Fourier transform of the image plane 
Kronecker product of the DDE (for the LOFAR beam on 
a given baseline this block is typically 10 minutes and 0.2 
MHz wide). The residual visibilities in each polarization 
are interpolated from the sum in Eq. [71 In practice, in or- 
der to minimize the support of the W-term and associated 
aliasing effect, we multiply the DDE in the image plane by 
a Prolate spheroidal function (see Appendix [Bl for more 
details). 

The predicted visibilities are removed from the mea- 
sured visibility by computing the residual visibilities 
Residual ~ V—Al'^. Applying A^ applies a correction to 
the residual visibilities, and projects the result onto a grid 
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Fig. 3. The essential part of the A-Projeetion algorithm relies in the predict step, which transforms a 2D sky image 
(the projection of the celestial sphere on a plane) into a set of visibilities. We have simulated a dataset having one 
off axis source, where the true visibilities (black dashed line) have been estimated using Eq. [T] taking the beam into 
account. This plot shows the comparison in all measured polarizations between the exact value of the visibility of a 
given baseline, and the A-Projection estimate (gray line). Contrarily to a traditional predict step, the visibilities are 
modulated by the beam amplitude (dotted line) and we have time-dependent polarization leakage. The over-plotted 
graph shows a zoom in the small region shown on the top-right panel. In the degridding step, we use a computationally 
efficient closest neighbor interpolation, creating steps in the predicted visibilities. 



(the gridding step) and Fourier transforms it. In practice 
this is done as follows: 



Vr 



COTT. u.v.w.i 



* V 



residual, M,tij 



(8) 



and 



(9) 



The resulting dirty image is still corrupted by the 
phased array beams related effects discussed in Sec. 
Before the minor cycle we can either multiply each 
4-polarization I^Yuiyi^) pixel by the 4x4 matrix 



'D^T?(x) ^, or simply normalize each polarization of 

-^dirty T^^T^uix), the diagonal elements of V^V^x). 

As shown in Sec. 14.31 fully applying V^Vlx)^^ to each 
pixel in Ifj^ij-^y shows a minor improvement. 



The computational cost of taking DDE into account 
using A-Projection depends on (i) whether it is baseline 
dependent, (ii) the angular size at which the effect needs 
to be sampled (thereby constraining the size support of 
the convolution function) and (iii) the amplitude of the 
terms of the 4x4 I?p'^(s„) matrix. In the case of the 
full polarization A-Projcction, the data needs to be cor- 
rected per baseline, per time and frequency slot. For each 
of those data chunk, in order to recover the corrected 4- 
polarization visibilities, one needs to take into account the 
16 terms of the 4x4 'Dp^{x) Mueller matrix, and the 
4 visibilities built from the 2D Fourier transform of the 
sky model. Therefore in addition to the 16 convolution 
function estimate per baseline and time-frequency slot, 
in the gridding and degridding steps, one needs to com- 
pute 16 X Ng operations per 4-polarization visibility point, 
where Ns is the support of the convolution function. The 
algorithmic complexity is further discussed in Sec. 14.41 but 
this implementation is too slow to be practical. 
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3.2. Separating antenna beam pattern and 
beam-former effect 

Depending on the assumptions and system architecture 
one can find powerful algorithmic optimizations. We show 
here that in the case of LOFAR, we can use the fact that 
althought stations are rotated one to another, the ele- 
mentary antennas are parallel. The effective phased ar- 
ray beam Bp s of station p is modeled as Bp s — ap s.Ep s, 
where ap^s is the array factor, and Ep^s is the element beam 
pattern. The term ap^s depends on the phased array geom- 
etry and on the delays applied to the signal of the individ- 
ual antennas before the summation (by the beam-former 
of each individual LOFAR stations). The term Ep s mod- 
els both the individual element antenna sensitivity over 
the sky and its projection on the sky. We have 

Vec{V;°n = JVec{ap,s.Ep.s.Is.E^,sKs) 

s (10) 
. exp (— 2i7r0(u, v, w, s))ds 

with ap,s being scalar valued and Ep^s is non-diagonal be- 
cause (intuitively) the element beam projection on the 
sphere depends on the direction. Applying the convolu- 
tion theorem to Eq. [TU] we obtain: 

Vec(T/--) = J^[Els ® Ep^s] 

* J'[ap^s.a* „.exp (-2i7r(?!)i(w,s))] ^^1) 

* / /g. exp (— 2i7r0°(u, w, s))(is 
s 

All LOFAR stations have different layout on the 
ground, so the scalar valued array factor 

^p.s IS sta- 
tion dependent. However all the individual antenna of 
all stations point at the same direction and we can as- 
sume that the Mueller matrix is baseline independent ie 
E* s® Ep^s = -E'o s ® Eq s- This requires an additional cor- 
rection step of the gridded uv-plane visibilities but as we 
will see below, this is an interesting algorithmic shortcut 
because the element-beam correction can be applied on 
the baseline-stacked grids. 

The element beam is very smooth over the sky and in 
most cases it can be assumed to be constant at time-scales 
of an hour, so that the polarization correction step does 
not need to be often applied. The degridding step goes as 
follows: (i) in each time-frequency slot A(t, i>)e where the 
Mueller matrix of the element beam is assumed to be con- 
stant, the polarization correction is applied to the (XX, 
XY, YX, YY) grids as the sum of convolved grids by the 
terms of Eq ^iSiEq^s- We then loojUover baseline (pq), and 
time-frequency range A(i, iy)a where the array factor and 
w-coordinate are assumed to be constant within A{t, v)e. 
For each step in the loop (ii) the oversampled convolu- 
tion function for baseline {pq) is estimated in A(t, i')a for 
the term of the second line of Eq. [11] and (iii) it is used 
to interpolate the predicted visibilities at the given uv- 
coordinate, separately on each polarization. 

^ We can parallelize the algorithm at this level. 



As explained in Sec. 14.41 the computing time for es- 
timating the convolution functions can be quite signifi- 
cant, and this scheme allows us to compute one convo- 
lution function per baseline instead of 16, and 4 grid- 
ding/ degridding steps instead of 16. We note however that 
the assumption of baseline independence of the Mueller 
matrix on which is based this optimization starts to be 
wrong in the cases of direction dependent differential 
Faraday rotation, or for the longer baselines where the 
curvature of the earth starts to have an importance (in 
that case the element beam are not parallel) . As discussed 
in Sec. 14.41 this computing time of this implementation is 
dominated by the convolution function estimate. 

3.3. Separating the W-term: hybrid w-stacking 

The support of the A-term is determined by the minimum 
angular scale to be sampled in the image plane. The beam 
or ionospheric effects are in general very smooth on the sky 
so that little amount of pixels are needed to fully describe 
the effects, therefore corresponding to a small convolution 
function support size (typically 11x11 pixels). The highest 
spatial frequency in the image plane is the W-term and 
its support can be as big as ~ 500 x 500 pixel for the long 
baselines, wide fields of view, when the target field is at 
low elevation. This forces us to (i) compute a convolution 
function with a large support, and (ii) grid each individual 
baseline using the W-term dominated large convolution 
function. 

We note however that the W-term is in itself baseline 
independenl|f|: two baselines characterized with different 
ionosphere, beams, but with the same w-coordinate will 
have the same W-term. We therefore here slightly change 
the piping of the algorithm by taking into account the 
A- Term and the W-term separately as follows: 

Yec{V;'°-n = T[E*s ® Es] 

* J'[exp [-2iTl(l)^(Wplane, «))] 

* J'[ap,g.a* exp (-2j7r0i(Aw, s))] ^^"^^ 

* / /g. exp (— 2i7r(/)"(u, u, s))(is 
s 

We consecutively grid or degrid the data in w-slices ie 
that have similar w-coordinates. This algorithm is also 
known as W-stacking. In addition, we take into ac- 
count the fact that the points can lie above or below 
the associated w-plane central coordinate using the term 
exp (— 2i7r0^(Aw, s)), where Aw = w — Wpiane- This step 
is similar to the traditional w-projection algorithm. If we 
have enough w-stacking planes, Au; is small, and the sup- 
port of the baseline-time-frequency dependent convolu- 

^ We talk about baseline dependence when a set of base- 
lines with exactly the same uvw coordinates can give different 
visibilities. 

^ See for example Maxim Voronkov's presentation at 
http://www.astron.nl/calim2010/presentations in the context 
of ASKAP 
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tion function remains small, leading to a dramatic de- 
crease of the total convolution function estimation time. 
Conversely, given a convolution function support we can 
find the maximum usable Aw and derive the number of 
w-stacking planes as a function of the observation's maxi- 
mum w coordinate (see Sec. l4.4.51 for more detailed discus- 
sion) . In the case of LOFAR, choosing a convolution func- 
tion support of ^ 21 pixel gives a number of w-stacking 
planes of ^ 30. 

This requires yet an additional step as compared to 
the implementation described in Sec. 13.21 We describe be- 
low the degridding step AI^. First, following the nota- 
tion introduced above (i) in each time-frequency interval 
A{t, v)e, we correct the 4-polarization grids from the ele- 
ment beam (including projection effects) using Eq ^^Eq^s- 
Then (ii) we loop over the number of w-planes (rang- 
ing from —vfmax to ^max, sec Sec. [C]), and convolve the 
grid obtained in (i) by the associated w-term (appear- 
ing in the second line of [T^ . Finally in (iii) for each 
w-plane obtained in each step of the loop (ii), we loop 
over the set of baselines (jpq)w and time-frequency range 
A(t, z^)a,w associated with the given w-plane. We interpo- 
late the predicted visibilities at the given uv-coordinate, 
separately on each polarization, based on the oversam- 
pled ^"[0^^3.0*^3. exp (— 2i7r(/)-'^(Aw, s))] convolution func- 
tion. As discussed in Sec. 14.4.31 this is the fastest imple- 
mentation of A-Projection we have obtained so far. 

4. Simulations 

In order to test the algorithmic framework described above 
we have performed a series of tests on LOFAR simu- 
lated datasets. In this section we summarize those results 
and discuss the computational costs of A-Projection for 
LOFAR. 

4.1. One off-axis heavily polarized source 

As discussed above by using A-Projection we can com- 
pute the exact value of the 4-polarization visibilities on a 
given baseline at a given time and frequency, from (i) the 
sky model and (ii) the baseline-time-frequency-direction 
dependent effects. In a first step, we focus on testing the 
degridding full polarization A-Projection degridding (or 
predict) step {A.I). The accuracy of this step is indeed 
vital for the convergence of the CLEAN/ A-Projection al- 
gorithm. Our experience suggest that any small numerical 
systematic bias in this operation can lead to strong diver- 
gence of CLEAN. 

In order to test this transformation, we have simu- 
lated a dataset having only one polarized ofF-axis point- 
like source in a low-band 62 MHz LBA dataset. The 4- 
polarization visibilities have been generated using BBS 
(BlackBoard SelfcaQ), which computes a direct calcula- 



tion of the visibilities following Eq. [TJ It takes into account 
the beams of both the individual elementary antenna (and 
their projection on the sky) , and the phasing of the indi- 
vidual antenna within a LOFAR station (the array fac- 
tor). We located the simulated source a few degrees from 
the phase center and its flux density is non-physically po- 
larized (with stokes parameters of I,Q,U,V=100,40,20,10 
Jy). Fig. |3] shows the real part of BBS and A-Projection 
predicted visibilities on the baseline (01). The residuals 
are centered at zero, and A-Projection performs very well 
in predicting the 4-polarization visibilities, taking into ac- 
count the complicated effects of the LOFAR phased array 
station's beams. A traditional predict using simple Fourier 
transform, facets or W-Projection would suffer from much 
higher residuals, driving systematics in the deconvolution, 
thereby limiting the dynamic range. Here the residual er- 
rors are dominated by the interpolation type we use in the 
uv domain (closest neighborhood, see |B] for details). 

4.2. Dataset with many sources 
4.2.1. LOFAR station beam 

In order to test our modified implementation of the 
whole CLEAN algorithm (involving gridding and degrid- 
ding steps), we have simulated a dataset containing 100 
sources, with the source count slope followi ng the 1.4 



GHz NVSS source count (|Condon et al.lll998l) . As in the 



dataset described above, the visibilities are generated us- 
ing We have taken into account the Jones ma- 
trices of the individual elementary antenna as well as 
the array- factor (the beam- forming step). As explained 
in Sec. 13.21 and shown in Fig. [3l as the LOFAR stations 
are rotated one to another, all baselines will be effected 
by beam effects differently. We have corrected the visi- 
bilities for the beam effect at the first order by comput- 
ing = [Dl;'']7„Xi''iDl''''"]7o\ where Z)p(so) is the 
Jones matrix of station p computed at the center of the 
field. This mostly compensates for the element beam ef- 
fects, and more specifically the projection of the dipoles 
on the sky. However, as shown below, the LOFAR fields 
are big, and the projection of the dipoles vary across the 
field of view. Most of the sources thereby generated have 
fiux densities comprised between 10~^ and 1 Jy. We have 
added two bright sources of 100 and 10 Jy. 

Fig. m shows the restored images produced using three 
different modified CLEAN algorithm. Each of those im- 
age contains ~ 3000 pixel in side, and is ^ 6 degrees 
wide. We have used 15 w-stacking, 128 Aw-planes (see 
Sec. 13. 3|) . a maximum baseline of 5 kA, a BriggsQ weight- 
ing scheme, a cycle-factor of 2.5 and a number of minor 
cycle iterations of 10 000. The first map ha s been generated 
using W-Projection (jCornwell et al as implemented 

m casa|. Strong artifacts are present around the bright- 



® See http: / /www.lofar.org/wiki/doku.php?id=public:docu- 
ments:lofar_documents&s[]=bbs for a review of BBS function- 
alities 



^ With a robust parameter of 
http: / /www.aoc. nrao.edu / dissertations / dbriggs 
* http://casa.nrao.edu 
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Fig. 4. This figure shows the dramatic effect of the LOFAR phased array beams for a simulated dataset. Specifically, 
the visibilities have been generated taking into account (i) the individual antennas, (ii) their projection on the sky and 
(iii) the beam- forming step (the scalar array factor) . The top- left image shows the deconvolved sky as estimated with 
a traditional imager not taking into account time-frequency direction dependent effects. The top-right and bottom- 
left have been generated by taking into account the array factor only and both array factor and the element beam 
respectively. The bottom right panel shows the input flux densities are correctly recovered. 



est off-axis source, and the dynamic range reaches 1 : 230. 
In the second image we have used our implementation of 
A-Projection taking into account the array factor only. 
Taking that effect into account the effect of the lower the 
residual visibility levels on each individual baselines and 
the dynamic range reaches ^ 1 : 3.400. In the third image 



we have taken into account all the LOFAR beam effects: 
the individual antenna sensitivity, their spatially varying 
projection, and the array factor. The dynamic range in- 
creases to - 1 : 12.00(11 

^ It seems the output images of our imager is presently lim- 
ited at ~ 10~* accuracy for some numerical precision prob- 
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The sources flux densities in the restored maps have 
been extracted using the LOFAR source extraction soft- 
ware As shown in Fig. |4l the input flux den- 
sities are very weh recovered. 

4.2.2. LOFAR station beam and ionosphere 

In order to test the ionospheric correction with A- 
Projection, we have simulated a dataset containing 100 
sources, affected by a simulated ionosphere. The iono- 
spheric effects essentially consist of a purely scalar, direc- 
tion dependent phase. Faraday rotation is not included. 
In addition to this purely scalar ionospheric phase ef- 
fect the visibilities are affected by the direction dependent 
LOFAR's stations beam effects discussed above. 

The ionosphere is modeled as an infinitesimally thin 
layer at a height of 200 km. The Total Electron Content 
(TEC) values at a set of sample points are generated by a 
ve ctor autoreg r essive (var) random process. As described 
in Ivan der ToJ (|2009h the spatial correlation is given by 
Kolmogorov turbulence. The values at intermediate points 
are found using Kriging interpolation. The set of sample 
points are the pierce points for five points in the image 
located at the four corners and the centre. This way the 
ionospheric layer is sampled in the most relevant area. 
There are at least five sample points within the field of 
view of each station. 

Fig. [5] shows the dirty image at the location of a given 
source before and after A-Projection correction of beam 
and ionosphere. This suggets that the dirty undistorted 
sky is properly recovered from the corrupted visibilities. 
We compare in Fig. [5] the cleaned image with and without 
the A-Projection correction. Those simulations demon- 
strate that A-Projection can indeed deal with the com- 
plicated effects associated with the ionosphere. 

4.3. Convergence 

We have also studied the influence of the various correc- 
tions described in this paper on the convergence speed 
of the estimated flux densities through the major cy- 
cle loop. For this we have used the dataset described 
in Sec. 14.11 containing one off-axis source having stokes 
(IQUV)=(100,40,20,10) Jy. 

Fig. [7] shows the evolution of the estimated flux den- 
sity as a function of the major cycle iteration number for 
different algorithm. In the first panel, the W-Projection al- 
gorithm somewhat converges to the observed flux density 



lem: all the code uses single precision floating point numbers, 
and the roundings of products and sums involved in the algo- 
rithm seems to generate limitations at this level. Detailed tests 
based on multi-threading and single threading comparison con- 
sistantly reveal an instability at this level. For the LOFAR sur- 
veys however, we might not need higher accuracy, as we plan 
to use direction dependent pealing to substract the brightest 
sources on the first ~ 1:10^ dynamic range. 

See http://www.lofar.org/wiki/doku.php for more infor- 
mation. 



Fig. 5. This figure show the dirty image at the loca- 
tion of a bright source in a simulated dataset. The im- 
ages are 1 deg in diameter. In the image of the left panel 
the dirty image shows important distortion as a result 
of ionospheric effect. The A-Projection correction (right 
panel) for an ionospheric phase screen clearly shows im- 
provement. 



(to the "beam- multiplied" sky). The situation is better 
using full polarization A-Projection without the element 
beam (therefore assuming the dipole projection on the sky 
are constant across the field of view). We note that in the 
absence of polarization, the situation is not as bad. Taking 
the element beam into account, the algorithm makes the 
estimated flux densities to converge to the true values to 
better than a percent. In this version of the algorithm, 
the image plane correction is the same on all polarization 
and is just a scalar normalization (the average primary 
beam normalization) . The situation gets slightly better in 
terms of convergence speed by applying the image-plane 
renormalization described in Sec. 12.21 

In any case the accuracy of the recovered flux densities 
seems to be guaranteed by the accuracy of the degridding 
step. Our experience in implementing A-Projection sug- 
gest that any small systematic error in the degridding step 
can lead to biases in the recovered flux densities or diver- 
gence in the more severe cases. The image-plane polariza- 
tion correction, seems to be bringing something positive 
to the convergence speed, but that step appears not to be 
necessary. 

4.4. Computational and memory-related costs 

Given the large amount of data that have to be processed 
in the imaging of a interferometric dataset, reducing the 
algorithmic complexity is of primary importance. 



4.4.1. Memory-related issues 

The A-term is generally very smooth in the image plane, 
with corresponding small support convolution functions 
and under those conditions the A-Projection algorithm 
is virt ually free as explaine d in Bhatnaear et al.l ()2008f l 
(and in Cornwell et al.ll2008l in the case of W-Projection): 
the A-term and the low w-coordinate convolution func- 
tion support is less or comparable to the spheroidal func- 
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Fig. 6. This figure shows the deconvolved image synthesized from the simulated dataset described in Sec. 14.2.21 In 
the left panel, the ionospheric effects are not taken into account, and our deconvolution scheme naturally produces 
severe artifacts and high residuals in the reconstructed sky. The deconvolved image shown in the right panel has been 
estimated using our implementation of A-Projection with the time-dependent ionospheric phase screen. 
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Fig. 7. This figure shows the evolution of the estimated 
flux density as a function of the major cycle iteration for 
one polarized off-axis source. From top to bottom, left to 
right are generated (i) using W-Projection only, (ii) using 
the AW-Projection with the array factor only, (iii) using 
AW-Projection taking the full beam model into account. 
Finally in (iv) we show the effect of doing the image plane 
correction described in Sec. 12.21 



tion support that anyway needs to be applied in a tra- 
ditional gridder/degridder. However, as described in Sec. 
13.21 all LOFAR stations are rotated one to another, and 
the synthesized stations beams are all different in a given 
direction. This gives rise to a serious algorithmic compli- 
cation because the convolution functions become baseline- 
dependent: the number of possible convolution functions 

is 16 X Mimes X A'^Frcqs X A^Stations X (A'stations " l)/2. With 

~ 800 to ^ 1500 baselines, even with a convolution func- 
tion every 10 minutes, a typical observing run of 8 hours, 
one frequency channel, and an average support size of 
^ 30 pixel (taking into account the w-term), this gives 
a ^ 100 Gbytes of needed storage. This is optimistic as 
in the case of other DDE such as the ionosphere, the A- 
term will have to be evaluated every 30 seconds. Even if 
the storage is done at the minimal necessary resolution, 
those numbers are clearly prohibitive for storing the con- 
volution into memory. The convolution functions therefore 
have to be computed on the fiy, and algorithmically, this 
represents an additional cost. 

4.4.2. Naive and element-beam-separated A-Projection 

The LOFAR station's beams are smooth on the sky, and 
the corresponding convolution function support are small 
(typically 11-15 pixel complex image), while the W-term 
needs up to >500 pixel depending on the w-coordinate 
(with an average of ^-^30 pixels). The computing time de- 
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pends on the convolution function support Ns, and for the 
implementations described in Sec. 13. II and 13.21 we have: 



Ns = 



E 



Nsicf) 



cf={S.Ap,A^.,W} 



(13) 



where the subscripts S, Ap, Aq, and W stand for the 
Prolate spheroidal (^ 7 — 11 pixels), A-terms of antenna 
p and q and W-term. For a typical field of view, we have 
Ns{A) = 9-15 and NsiW) « D'^.w, where D is the 
field of view diameter and w is the W-coordinate of the 
given baseline in the given time-slot (see Appendix |B]). For 
that baseline, time and frequency slot we can write the 
total computing time as t^'^ ^^t = tp'^^g^d + ^pq,CJ^ where 
^grid ^^'^ ^CF ^^'^ gridding and the convolution func- 
tion estimate times. In most cases for LOFAR data, we 
have Ns ~ Ns{W) and tg„d 9S D^w'^. Our experience has 
shown that tcF is dominated by the computing time of the 
Fourier transform of the zero-padded convolution function 
(see Fig. [5]), which size is O.Ns where O is the oversam- 
pling parameter which controls the quality of the nearest 
neighborhood interpolation. If N^^ oc l/{AT^in^'^win) 
is the number of visibility buffers associated with the 
W-plane (AT„i„ and Av^in are the time/frequency win- 
dow interval in which the DDE are assumed to be con- 
stant), then we have t^p « N^f.0^w^D^Aog{OwD^)). 



"buf 

The gridding time for a given w-plane is simply t 



w 

grid 



N^^w D where N^^ is the number of visibilities associ- 
ated with the w-plane. The total computing time can be 
written as: 



Neltel + 



W —planes 



at 



w 

grid 



(14) 



where a and h are constants, Nei is the number of 
time/frequency blocks in which the element-beam is as- 
sumed to be constant, and tei cN'^^,^{l + 2 log(7Vpia:)) is 
the computing time necessary to apply the element-beam 
to the grids (c = 16 for full polarisation imaging, c = 8 
for I-stokes only). For the implementation described in 
Sec. 13.11 we have tei = 0, but both a and h are 16 times 
higher (8 for I-stokes only) than in the case of the algo- 
rithm described in Sec. 13.21 Fig. |S] shows the gridding and 
convolution function estimate times as a function of the 
w-coordinate of the given baseline in the given time-slots. 
From that figure it is clear that (i) the W-term is the most 
important limiting factor and (ii) the estimate of the con- 
volution function represents a major limitation of those 
implementations, especially in the cases of a quickly vary- 
ing DDE such as the ionosphere where the convolution 
function calculation would largely dominate. 

4.4.3. Hybrid W-Stacking and A-Projection 

As explained in Sec. 13.31 for the W-staking implementa- 
tion, the baseline-time-frequency-dependent oversampled 
convolution function is the Fourier transform of the zero- 
padded image-plane product of the spheroidal, the A-term 



Gridding time 
CF estimate time 




w-coordinate of the baseline (m) 

Fig. 8. Due to their different orientation, all LOFAR 
station have a different synthesized primary beam. The 
corresponding convolution function for applying the A- 
projection are therefore baseline-dependent, and we have 
to compute them on the fly. This figure is showing the 
computing time of both the gridding and convolution func- 
tion estimate time for a 10 degree field of view, a maxi- 
mum w-coordinate to be 10^ meters, and a time- window of 
Twindow = 1200 seconds. When the DDE quickly varies, a 
convolution function is often required, and the convolution 
function computing time can largely exceed the gridding 
computing time. 



and a W-term accounting for the Aw-distance between 
the given visibilities and the corresponding w-plane. The 
bigger is the support of the convolution function and the 
less w-planes we need to fully correct for the w-term. As 
explained in Appendix [B] in order to properly sample the 
w-term in the image plane, to a given w-coordinate and 
field of view correspond a convolution function support. 
We can then obtain Aw = a.Ns/D^ (with a = V2/(47r)), 
and the needed number of w-planes between —Wmax and 
Wmax is Nw = WmaxD"^ / {ci-Ns). We computc the W-term 
convolution in the image plane so this step cost goes as 
tw oc Np^^{l + 2log{Npix))- The total computing time is 
then: 



ttot - bN,,sNl + cNbufO^NllogiONs) 

+Nel [tei + Nwtw] 



(15) 



where Nei is the number of time/frequency buffers in 
which the element-beam is assumed to be constant, N^uf 
is the number of time / frequency buffer where the A- / Aw- 
term are assumed to be constant, 6 and c are constants. 
In Tab. [1] we present the typical computing times for a 
major cycle with the implementation discussed here and 
presented in Sec. 
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Table 1. Computational performance of our fastest A-Projcction implementation described in Sec. 13.31 and 14.4.^ 
Columns display the various performance corresponding to different imaging settings. Those tests were performed for 
a 12 subbands dataset (1 channel per subband), on the CEP2 LOFAR cluster node each havin 24 AMD Opteron(tm) 
6172 Processors, and 64 Gb of RAM memory, tf^j is the total for a gridding or degridding step. The times tcp, 
tgrid, tel, tw E^re given in fraction of tfjf, and correspond to the times needed to compute convolution functions, to 
gridding/degridding the data, to apply the element beam, and to apply the W-term respectively. We are confident we 
can still win a factor of > 2 — 4 with respect to those times values. 
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5. Summary and future work 

The new generation of low-frequency interferometers 
(EVLA, MWA, LWA) and precursors of the SKA 
(LOFAR, ASKAP, MeerKAT) cannot be properly used 
without the development of new techniques to calibrate 
and apply in the imaging step the many direction depen- 
dent effects infiuencing the electro-magnetic field. These 
effects mainly include the antenna/stations complicated 
beam effects, ionosphere phase and amplitude fast varia- 
tion and associated Faraday rotation. 

In this paper we have mainly discussed the issues as- 
sociated with the application of the LOFAR elementary 
antenna and station beams, which involve the usage of a 
few levels of phase arrays. Using the Measurement equa- 
tion formalism, the associated high complexity (wide field 
of view, individual station rotations, projection of the el- 
ementary dipoles on the sky), the problem of imaging 
and deconvolution of LOFAR calibrated dataset can be 
so lved by applying t he A- Pro.jection algorithm described 



Bhatnaear et al. ( 20081 ). Due to its very large field of 



view ('--^ 5 — 10 degrees in diameter), a full-polarization 
A-Projection implementation dealing with non-diagonal 
Mueller matrices is needed for LOFAR. In this paper 
we have shown that A-Projection can indeed deal with 
the heavily baseline-dependent non-diagonal direction- 
dependent Mueller matrices associated with LOFAR base- 
lines. We have also demonstrated that effiscient iono- 
spheric correction can be performed using A-Projection. 
We have proposed a series of implementations of A- 
Projection for LOFAR taking into account non-diagonal 
Mueller matrices, aiming at accuracy and computing effi- 
ciency. 

5.1. Optimizations 

As explained in Sec. l4.4l the DDE although varying quickly 
are smooth on the sky, so the convolution functions have 
a small support. However, in the case of LOFAR the wide 
field of view imposes to take into account (i) the W-term, 
and (ii) the off-diagonal terms of the Mueller matrices 
(due to the varying projection of the dipoles on the sky. 



or to the Faraday rotation). In addition (iii) the baseline 
dependence make the storage of the convolution functions 
to be prohibitive and those have to be computed on the 

fly. 

In our first implementation (Sec. l3.1() the constraint (i) 
makes the W-term convolution function support to often 
dominate, while (ii) requires to set up a complicated ma- 
chinery in the gridding and degridding step, taking into 
account all polarizations to correct for polarization leak- 
age. In the case of a quickly varying DDE such as the 
ionosphere, the step (iii) can completely dominate the 
computing time (usually set by the gridding/degridding 
times). 

Using the fact that LOFAR station's elementary anten- 
nas (responsible for the complicated projection effects) are 
parallel although the station's layout are rotated, we could 
separate the first implementation in two steps (Sec. \S7I\\ . 
The first is a purely scalar gridding/degridding and suffers 
from (i) and (iii), while the second is only affected by (ii). 
The non-diagonality of the Mueller matrix is corrected on 
the baseline-stacked uv-plane, so we win a factor between 
10 and 16 as compared to the first implementation. It 
is important to note that this optimization breaks down 
when the non-diagonal Mueller matrix becomes baseline- 
dependent such as in the cases of very long baselines (due 
to the earth's curvature), or to the ionosphere's differential 
Faraday rotation. 

In the last implementation (Sec. 13. 3|) . we still ap- 
ply the direction-dependent non-diagonal Mueller matrix 
of the baseline-independent element-beam, and go fur- 
ther by separating the W-term from the A- Term. The 
former is responsible for the large support sizes and is 
baseline independent (around a given w-plane), while the 
later has small support and is baseline dependent. In 
this implementation, we grid/degrid the data based on 
a small support baseline-dependent oversampled convolu- 
tion function, and convolve the input or the output grids 
(forward and backward steps respectively) with the non- 
oversampled W-term convolution function. We save com- 
puting time by computing the oversampled convolution 
function of a generally much smaller support, and the net 
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gain lies between 2 and 10 as compared to the seconds 
implementation fSec. 13.21 and Tab.[T]). 

Such optimizations are vital for the feasibility of the 
LOFAR extragalactic surveys, given their huge integration 
times totalizing hundreds of days. For the SKA, it will 
be important to take the algorithmic optimization into 
account, as they are linked to the instrument and system 
architecture. They can reduce the algorithmic complexity 
by orders of magnitude. 

5.2. Real LOFAR data and future work 

We have conducted many experiments with LOFAR in 
order to test the our imager on real LOFAR data. 
Specifically, we have observed the same set of objects, 
in different observations having their pointing centers 
shifted by few degrees. Compared to a traditional imager 
(CASA) , our implementation of A-Projection gives coher- 
ent corrected flux densities at the level of 5-10% on the 
edge of the field. However, although we have shown in 
this paper that the algorithm is giving excellent results 
on simulated dataset, it shows little or no dynamic range 
improvement in the resulting images, as compared to a 
traditional imager. 

LOFAR is however a very complex machine and feed- 
ing the imager with a wrong direction-dependent calibra- 
tion solution will not lead to any improvement in the de- 
convolved sky, and can of course even decrease the dynam- 
ical range. In order to improve LOFAR's dynamic range 
we will have to make progress in understanding the cali- 
bration aspects of DDE, especially those related to iono- 
sphere and differential Faraday rotation. Much effort is 
spent on that direction, and DDE calibration algorithms 
of low complexity are under deve lopment or have already 
been achieved such as SAGECal (|Yatawattall2008l ). 

On the imager side, further ongoing development in- 
clude (i) implementation on GPU, of either or both of 
the gridding/degridding or convolution function calcula- 
tion, (ii) compressed sensing in the image plane, (iii) uv- 
plane interpolation techniques different f rom zero-padding 



FFT , and (iv) Wide-Band A-Projection (jBhatnagar et al, 
20121) . Ionosphere and true beam calibration, in combi- 



nation with pealing techniques, will hopefully allow us to 
use the framework presented in this paper to reach the 
high ~ 10^ — 10^ dynamic range needed to construct the 
deepest extragalactic LOFAR surveys, . 
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Appendix A: Measurement Equation formalism 

In order to model the complex direction dependent ef- 
fects (DDE - station beams, ionosphere, Faraday rota- 
tion, etc), we make extensive us e of the M easurement 
Equation formalism developed by iHamaker et al. (1991). 
The Measurement Equation provides a model of a generic 
interferometer. Each of the physical phenomena that 
transform or convert the electric field before the corre- 
lation computed by the correlator are modeled by lin- 
ear transformations (2x2 matrix). If s = (l,m,n 



^) is a sky direction, and ^ stands for the 



Hermitian transpose operator, then the correlation matrix 
Vpq between antennas p and q, can be written as follows: 



T/meas 

pq 



Gp (^Dp^^.Kp^s.Fs . F^.K^^,.Dl)j Gf (A.l) 



where Dp s is the product of direction-dependent Jones 
matrices corresponding to antenna p (e.g., beam, iono- 
sphere phase screen, and Faraday rotation). Gp is the 
product of direction-independent Jones matrices for an- 
tenna p (like electronic gain and clock errors). The ma- 
trix Kp^s describes the effect of the array geometry and 
correlator on the observed phase shift of a coherent wave- 
front between antenna p and a reference antenna. This 
effect is purely scalar so it is reducible to the prod- 
uct of a scalar and the unity matrix, so we can write 
Kp^s-K^s — exp {~2iTT(j){u,v,w, s)).l where(M, u, w) is 
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the basehne vector between antenna p and q in wave- 
length units and 1 is the unity matrix. We then have 
(j){u, V, w, s) — u.l + v.m + — P — -w? — 1), where 

the — 1 term models the correlator effect when phasing the 
signals in the direction of w. Finally the product Fg ■ 
is the sky contribution in the direction s and is the true 
underlying source coherency matrix [[XX, XY], [YX, YY]]. 

This elegant formalism enables us to model the full po- 
larization of the visibility as a function of the true under- 
lying electric field correlation. In a simple and consistent 
way it takes the direction dependent and direction inde- 
pendent effects into account. Indeed most of the Jones 
matrices in the measurement equations have a fairly sim- 
ple formulation and radio calibration problem amounts to 
finding the various components of G and given a sky 
model Is = Fs.F^. 

The measurement equation introduced above can be 
written in a more extended and continuous form bet- 
ter suited for imaging by applying the Vec operatoiO to 
V^°''\ We obtain: 



Vec(t/7n = /p*.,®i?p,.).Vec(/.) 

5 



. exp {—2iTT(/)pq{u, V, w, s))ds 



(A.2) 



where (g) is the Kronecker product, Vec is the operator 
that transforms a 2x2 matrix into a dimension 4 vector. 

Appendix B: Further algorithm details 

In this section we describe some important details of the 
various implementations of A-Projection for LOFAR. In 
particular, we make extensive use of the Prolate spheroidal 
function for resolution adaptation and zero-padding for 
uv-plane interpolation. 

As explained in Sec.|31 since LOFAR stations are char- 
acterized by different primary beams, the gridding and de- 
gridding steps are baseline dependent. Therefore the con- 
volution functions cannot be computed once and kept in 
memory as is done for W-projection. Instead, they have to 
be computed on the fly (see Sec. 14. 4p . However, because 
the station's beam model is fairly complex and costly 
to evaluate (coordinates transformation, estimate of the 
Element beam Jones matrix), we store in memory at the 
minimal resolution the 4-polarization image plane beams 
(their Jones matrices). The necessary resolution is simply 
estimated as X/{2.DstaHon), where Dstation is the given 
station's diameter. 

The W-term is also estimated once and stored in mem- 
ory at the minimal resolution. This amounts to finding the 
maximum frequency to be sampled in the image plane, 
and the corresponding number of pixels corresponding to 
the minimum support required for the W-term convolu- 
tion function. If the image is of angular diameter Dim, 



The Vec operator transforms a matrix into a vector formed 
from the matrix columns being put on top of each other. It 
has the following useful properties: (i) Vec(AA) — Wec{A) 
,(ii) Vec(^ + B) = Vec{A) + Vec(B), and (iii) Vec{ABC) = 
(C^® A).Vec(B). 



the necessary resolution needed to properly sample to W- 
term is the inverse of the highest spatial frequency, located 
in one of its corners. The support of the W-term is then 

In order to interpolate the visibilities on the grid in the 
gridding step (or conversely in the degridding step) , or to 
adapt the resolution of the A- and W-terms we use a zero- 
padding interpolation. This scheme can produce artifacts 
due to the presence of sharp edges and aliasing problems, 
so we have to make extensive use of a Prolate spheroidal 
function. It is computed at the maximum resolution in 
the image plane. We then Fourier transform it, find its 
support Ns{Sp}l), "cut" it to that size, and store it in the 
uv-plane (hereafter S™). 

For the various algorithms presented in this paper, we 
have to compute the products of various DDE in the image 
plane. For example for the algorithm described in Sec. 13.11 
we have to adapt the A- and W-terms resolution before 
multiplying them in the image plane. We first have to 
find the support Ns of the convolution function as in Eq. 
[131 We first compute the image plane spheroidal at the 



resolutions of the A- and W-terms (Sp 
respectively) as follows: 



Ns{A) 



and S 



Ns{W) 



(B.l) 



where T is the Fourier transform, Z" is the zero-padding 
operator that puts the input into a grid of the size n. To 
estimate the A-term interpolated on an Ns x Ns pixel 



image: 



A^vs ^ (S^s)-ij-i (z^-\J-(s^^(^'a 



(B.2) 



We obtain the image plane effects at the same resolu- 
tion and multiply them according to the specific needs of 
the various implementations described in Sec. [3] and ob- 
tain the image plane product Pim- We then compute the 
oversampled convolution function as: 



CF 



ONs _ 



(B.3) 



where the resulting interpolated convolution function 
CF'^^^' has ONs x ONs pixels with O the oversampling 
parameter. If the final image is of size Npi^ x Npix, because 
we have used the spheroidal function, after applying the 
inverse Fourier transform, we have to normalise the dirty 
image by the spheroidal function S^''" — J^~^Z^p"S™. 

As outlined above, all LOFAR stations are different, 
and the convolution functions are baseline dependent. The 
for loops described in Sec. [3] are therefore parallelized on 
baseline. For the degridding step from a common read-only 
grid, the residual visibilities are estimated independently 
using different threads. The code has been created in the 
LOFAR package and is is dependent on the casacore and 
casarest packages. 
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Appendix C: Limiting the maximum w-coordinate 

For the traditional interferometers at high frequencies, in 
general the field of view is small enough so that the W- 
term can always be neglected {w.{\/^ — P — m? — 1) ^ 0). 
However, for the wide fields of view, long baselines inter- 
ferometers, the W-term is of great importance, and not 
taking it into account produces artifacts and considerably 
reduces the dynamic range. For a given field of view, or 
a given angular distance to the phase center, the impor- 
tance of the W-term increases as the w-coordinate value, 
i. e. when the targeted field is at low elevation. It is there- 
fore important to stress that wide fields of view or long 
baselines do not directly mean that the W-term will take 
an importance: whatever the baseline or field of view, a 
planar array that would observe at the zenith would al- 
ways give a null w-coordinate. 

Algorithmically, for A-Projcction, the W-term is one 
of the main limiting factor. Using the W-Projection algo- 



rithm (jCornwell et al.l 120081 ) . assuming the W-term sup- 



port is higher that the Prolate spheroidal support, the 
gridding time evolves as t^-cid.w uP'.D'^, because the 
highest spatial frequency in the image plane has to be 
properly sampled. We found that on a typical LOFAR 
dataset this non-linear behavior generally makes the < 5% 
of the points with the highest w-coordinates to be respon- 
sible for > 70% of the computing time (as in Fig. IC.1|) . 
This little amount of data does not necessarily bring sensi- 
tivity or resolution. Setting a wmax value above which the 
visibilities are not gridded significantly increases the com- 
putational efficiency, without loosing sensitivity or resolu- 
tion. 




login(W-distance [Meters]) log^niW-distance [Meters]) 



Fig. C.l. For a given field of view the W-term in- 
creases for lower elevation of the target field. Using the W- 
Projection algorithm, the computing time increases with 
w'^ . This figure shows the normalized cumulative distribu- 
tion of the w-coordinate (left panel) for a typical LOFAR 
observation, and the corresponding normalized cumulative 
computing time (right panel). We can see that rejecting 
the ^ 5% of the points with w > 10* saves ~ 70% of the 
computing time. 



